EP (CPU TensoRT CUDA) accuracy test #22545

Open
wants to merge 4 commits into
base: main

jingyanwangms
Contributor

@jingyanwangms jingyanwangms commented Oct 22, 2024

Description

This test compares the outputs of the following Hugging Face models:

  • "microsoft/resnet-50"
  • "microsoft/Phi-3.5-mini-instruct"

Each model is run on PyTorch CPU and compared against [ORT CPU EP, ORT TensorRT EP, ORT CUDA EP] under different configurations (fp16, ORT graph optimization disabled, 1-layer transformer vs. full model, ResNet-18 vs. ResNet-50).
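The comparison described above comes down to an element-wise tolerance check between the reference output and each execution provider's output. Below is a minimal sketch of such a check, assuming the same criterion that `numpy.allclose` uses and the `rtol=1e-2, atol=1e-2` defaults visible in `run_comparison`. The function name `outputs_close` and the sample values are illustrative and are not taken from the PR.

```python
def outputs_close(reference, candidate, rtol=1e-2, atol=1e-2):
    """Return True if every element satisfies
    |candidate - reference| <= atol + rtol * |reference|.

    This mirrors the criterion used by numpy.allclose, written in plain
    Python so the sketch has no dependencies.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        abs(c - r) <= atol + rtol * abs(r)
        for r, c in zip(reference, candidate)
    )


# Hypothetical flattened outputs: a PyTorch CPU reference and an fp16 EP result.
pytorch_output = [0.120, -1.530, 2.750]
ort_output = [0.121, -1.527, 2.760]

print(outputs_close(pytorch_output, ort_output))
```

Loose tolerances such as `1e-2` are a common choice when fp16 is involved, since half precision carries only about three significant decimal digits.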

Future work:

  • Integrate with existing accuracy tests, such as Adrian's tool
  • Troubleshoot the error that occurs when Phi-3.5 is run with more than 1 layer
  • Add more models

Motivation and Context

@jingyanwangms jingyanwangms changed the title Add TensorRT accuracy test [WIP] Add TensorRT accuracy test Oct 22, 2024
Contributor

@github-advanced-security github-advanced-security bot left a comment

lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

from transformers import AutoModel, AutoTokenizer
from transformers import AutoModelForCausalLM
import torch
from transformers.onnx import export

Check notice

Code scanning / CodeQL

Unused import Note test

Import of 'export' is not used.
import numpy as np
import time
import unittest
import onnx

Check notice

Code scanning / CodeQL

Module is imported with 'import' and 'import from' Note test

Module 'onnx' is imported with both 'import' and 'import from'.
Module 'onnxruntime.test.onnx' is imported with both 'import' and 'import from'.
'attention_mask': pytorch_inputs['attention_mask'].numpy(),
'onnx::Neg_2': torch.ones(1, dtype=torch.int64).numpy() # ORT requires this input since it's in the exported graph
}
return model, pytorch_inputs, ort_inputs

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error test

Local variable 'ort_inputs' may be used before it is initialized.
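This class of alert typically appears when variables are assigned only inside `if`/`elif` branches and the function falls through to `return` for any other input. The body of `get_model_and_inputs` is not fully shown here, so the following is only a sketch of one common fix, under the assumption that it branches on `model_name`. The placeholder strings stand in for the real model and tensors.

```python
def get_model_and_inputs(model_name, use_minimal_model=True):
    # Assumed structure: one branch per supported model. The values are
    # placeholders, not the real Hugging Face objects.
    if model_name == "microsoft/resnet-18":
        model = "resnet-model"
        pytorch_inputs = {"pixel_values": "image-tensor"}
        ort_inputs = {"pixel_values": "image-array"}
    elif model_name == "microsoft/Phi-3.5-mini-instruct":
        model = "phi-model"
        pytorch_inputs = {"input_ids": "ids", "attention_mask": "mask"}
        ort_inputs = {"input_ids": "ids", "attention_mask": "mask"}
    else:
        # Failing fast here guarantees that every path reaching the return
        # statement has assigned all three variables.
        raise ValueError(f"Unsupported model: {model_name}")
    return model, pytorch_inputs, ort_inputs
```

With the explicit `else`, an unsupported name raises a clear `ValueError` rather than an `UnboundLocalError` at the `return` statement.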
def run_comparison(self, model_name, use_minimal_model=True, use_tensorrt=True, use_fp16=True, use_graph_opt=True, rtol=1e-2, atol=1e-2):
start_time = time.time()
model, pytorch_inputs, ort_inputs = get_model_and_inputs(model_name, use_minimal_model)
pytorch_output = run_model_in_pytorch(model, pytorch_inputs)

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error test

Local variable 'model' may be used before it is initialized.
def run_comparison(self, model_name, use_minimal_model=True, use_tensorrt=True, use_fp16=True, use_graph_opt=True, rtol=1e-2, atol=1e-2):
start_time = time.time()
model, pytorch_inputs, ort_inputs = get_model_and_inputs(model_name, use_minimal_model)
pytorch_output = run_model_in_pytorch(model, pytorch_inputs)

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error test

Local variable 'pytorch_inputs' may be used before it is initialized.
if model_name == "microsoft/Phi-3.5-mini-instruct":
fix_phi35_model(model_file)
providers = get_ep(use_tensorrt, use_fp16)
ort_output = run_model_in_ort(model_file, ort_inputs, providers, use_graph_opt=use_graph_opt)

Check failure

Code scanning / CodeQL

Potentially uninitialized local variable Error test

Local variable 'ort_inputs' may be used before it is initialized.
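The `get_ep(use_tensorrt, use_fp16)` helper called in the snippet above is not shown in this excerpt. As a rough illustration only, ONNX Runtime accepts a provider list whose entries are either a provider name or a `(name, options)` tuple, and the TensorRT EP exposes a `trt_fp16_enable` option. The function below, named `sketch_get_ep` to mark it as hypothetical, is a guess at what such a helper might return and not the PR's implementation.

```python
def sketch_get_ep(use_tensorrt=True, use_fp16=True):
    """Hypothetical helper: build an ONNX Runtime provider list.

    Providers are listed in priority order; ONNX Runtime falls back to
    later entries for nodes an earlier provider cannot handle.
    """
    providers = []
    if use_tensorrt:
        # trt_fp16_enable is a TensorRT EP provider option.
        providers.append(
            ("TensorrtExecutionProvider", {"trt_fp16_enable": use_fp16})
        )
    providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")
    return providers


print(sketch_get_ep(use_tensorrt=True, use_fp16=True))
```

A list in this shape could then be passed as the `providers` argument when creating an `onnxruntime.InferenceSession`.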
run_comparison(self, "microsoft/resnet-18",
use_minimal_model=False, use_tensorrt=False, use_fp16=False, use_graph_opt=False)

def test_resnet18_cpu_fp32(self):

Check warning

Code scanning / CodeQL

Variable defined multiple times Warning test

This assignment to 'test_resnet18_cpu_fp32' is unnecessary as it is redefined before this value is used.
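The practical consequence of this warning is that, when two methods in a class share a name, the later definition replaces the earlier one, so the first test never runs. The following sketch demonstrates the effect with a made-up test class. Only the method name is taken from the alert.

```python
import unittest


class DuplicateNameExample(unittest.TestCase):
    def test_resnet18_cpu_fp32(self):
        # This body is silently discarded when the name is rebound below.
        self.fail("never runs: shadowed by the later definition")

    def test_resnet18_cpu_fp32(self):  # noqa: F811 (intentional duplicate)
        self.assertTrue(True)


loader = unittest.TestLoader()
names = loader.getTestCaseNames(DuplicateNameExample)
print(names)  # only one test is discovered

result = unittest.TestResult()
loader.loadTestsFromTestCase(DuplicateNameExample).run(result)
print(result.testsRun, result.wasSuccessful())
```

Giving each configuration a distinct method name ensures every intended case is collected by the test runner.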
@jingyanwangms jingyanwangms changed the title [WIP] Add TensorRT accuracy test [WIP] EP (CPU TensoRT CUDA) accuracy test Nov 13, 2024
@jingyanwangms jingyanwangms changed the title [WIP] EP (CPU TensoRT CUDA) accuracy test EP (CPU TensoRT CUDA) accuracy test Nov 13, 2024
1 participant